Learning Deep Energy Models: Contrastive Divergence vs. Amortized MLE
Authors
Abstract
We propose several new algorithms for learning deep energy models from data, motivated by the recent Stein variational gradient descent (SVGD) algorithm. These include Stein contrastive divergence (SteinCD), which integrates CD with SVGD based on their theoretical connections, and SteinGAN, which trains an auxiliary generator to produce the negative samples used in maximum likelihood estimation (MLE). We demonstrate that SteinCD trains models with good generalization (high test likelihood), while SteinGAN generates realistic-looking images competitive with GAN-style methods. We further show that by combining SteinCD and SteinGAN, it is possible to inherit the advantages of both approaches.
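As a rough illustration of the SteinCD ingredient (a sketch under simplifying assumptions, not the authors' implementation), the NumPy snippet below runs a few SVGD steps started from the data to produce the negative samples of an approximate MLE update on a toy quadratic energy; the helper names rbf_kernel and svgd_step, the fixed kernel bandwidth, and all hyperparameters are illustrative.

```python
# Minimal sketch: energy model p(x) ∝ exp(-E(x)) trained with an approximate
# MLE gradient whose negative samples come from a few SVGD steps started at the
# data, in the spirit of SteinCD. Not the authors' code; names are illustrative.
import numpy as np

def rbf_kernel(X, h):
    """RBF kernel matrix and the summed kernel gradients needed by SVGD."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq / h)
    # sum_j grad_{x_j} k(x_j, x_i) = (2/h) * sum_j K_ij (x_i - x_j)
    grad_K = (2.0 / h) * (K.sum(1, keepdims=True) * X - K @ X)
    return K, grad_K

def svgd_step(X, grad_log_p, h=1.0, eps=0.1):
    """One SVGD update: particles move along the kernelized Stein direction."""
    K, grad_K = rbf_kernel(X, h)  # fixed bandwidth for simplicity (median heuristic is common)
    phi = (K @ grad_log_p(X) + grad_K) / X.shape[0]
    return X + eps * phi

# Toy energy E_theta(x) = 0.5 * ||x - theta||^2, so grad_x log p(x) = -(x - theta).
theta = np.zeros(2)
data = np.random.randn(128, 2) + 1.0           # observed samples

for _ in range(200):
    neg = data.copy()                          # CD-style: start the chains at the data
    for _ in range(5):                         # a few SVGD steps give the negatives
        neg = svgd_step(neg, lambda X: -(X - theta))
    # Approximate MLE gradient: data statistics minus model (negative-sample) statistics.
    grad_theta = (theta - data).mean(0) - (theta - neg).mean(0)
    theta -= 0.05 * grad_theta
```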
Related resources
Particle Filtered MCMC-MLE with Connections to Contrastive Divergence
Learning undirected graphical models such as Markov random fields is an important machine learning task with applications in many domains. Since it is usually intractable to learn these models exactly, various approximate learning techniques have been developed, such as contrastive divergence (CD) and Markov chain Monte Carlo maximum likelihood estimation (MCMC-MLE). In this paper, we introduce...
Learning to Draw Samples: With Application to Amortized MLE for Generative Adversarial Learning
We propose a simple algorithm to train stochastic neural networks to draw samples from given target distributions for probabilistic inference. Our method is based on iteratively adjusting the neural network parameters so that the output changes along a Stein variational gradient (Liu & Wang, 2016) that maximally decreases the KL divergence with the target distribution. Our method works for any ...
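A hedged sketch of the amortized sampler idea summarized in this entry: a deliberately simple linear generator x = W z + b (standing in for a neural network, an assumption of this sketch) is nudged so that its outputs move along the SVGD direction toward a standard-Gaussian target.

```python
# Sketch of amortized SVGD with a linear "generator"; all names and constants
# here are illustrative assumptions, not the paper's actual architecture.
import numpy as np

def svgd_direction(X, grad_log_p, h=1.0):
    """Kernelized Stein direction phi(x_i) with an RBF kernel."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq / h)
    grad_K = (2.0 / h) * (K.sum(1, keepdims=True) * X - K @ X)
    return (K @ grad_log_p(X) + grad_K) / X.shape[0]

# Target: standard Gaussian, so grad_x log p(x) = -x.
grad_log_p = lambda X: -X

d = 2
W, b = np.eye(d) * 0.1, np.ones(d) * 3.0        # generator parameters to learn
for _ in range(500):
    z = np.random.randn(64, d)                  # generator noise
    X = z @ W.T + b                             # generated particles
    phi = svgd_direction(X, grad_log_p)         # where SVGD wants the particles to move
    # Chain rule with phi held fixed: push the parameters so g(z) moves along phi.
    W += 0.05 * (phi.T @ z) / len(z)
    b += 0.05 * phi.mean(0)
```

Replacing the linear map with a neural network and the closed-form chain rule with automatic differentiation gives the general amortized update described in the entry.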
Learning to Sample Using Stein Discrepancy
We propose a simple algorithm to train stochastic neural networks to draw samples from given target distributions for probabilistic inference. Our method is based on iteratively adjusting the neural network parameters so that the output changes along a Stein variational gradient [1] that maximally decreases the KL divergence with the target distribution. Our method works for any target distribution...
Similarity-based Contrastive Divergence Methods for Energy-based Deep Learning Models
Energy-based deep learning models like Restricted Boltzmann Machines are increasingly used for real-world applications. However, all these models inherently depend on the Contrastive Divergence (CD) method for training and maximization of log likelihood of generating the given data distribution. CD, which internally uses Gibbs sampling, often does not perform well due to issues such as biased s...
Training Restricted Boltzmann Machines with Overlapping Partitions
Restricted Boltzmann Machines (RBM) are energy-based models that are successfully used as generative learning models as well as crucial components of Deep Belief Networks (DBN). The most successful training method to date for RBMs is the Contrastive Divergence method. However, Contrastive Divergence is inefficient when the number of features is very high and the mixing rate of the Gibbs chain i...
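For reference, a minimal NumPy sketch of the CD-1 update for a binary RBM, the baseline training step these entries discuss; the layer sizes, learning rate, and helper names are illustrative assumptions.

```python
# One contrastive-divergence step (k = 1 Gibbs transition) for a binary RBM.
# A sketch only: shapes, learning rate, and names are illustrative.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_vis, n_hid = 20, 8
W = 0.01 * rng.standard_normal((n_vis, n_hid))
b_v, b_h = np.zeros(n_vis), np.zeros(n_hid)

def cd1_update(v0, lr=0.01):
    """CD-1 parameter increments from a batch of binary visible vectors v0."""
    # Positive phase: hidden activations driven by the data.
    p_h0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # Negative phase: one reconstruction of the visibles, then the hiddens again.
    p_v1 = sigmoid(h0 @ W.T + b_v)
    v1 = (rng.random(p_v1.shape) < p_v1).astype(float)
    p_h1 = sigmoid(v1 @ W + b_h)
    # Approximate likelihood gradient: data statistics minus model statistics.
    n = v0.shape[0]
    dW = (v0.T @ p_h0 - v1.T @ p_h1) / n
    return lr * dW, lr * (v0 - v1).mean(0), lr * (p_h0 - p_h1).mean(0)

# Usage on a random binary batch (standing in for real data):
v_batch = (rng.random((32, n_vis)) < 0.5).astype(float)
dW, db_v, db_h = cd1_update(v_batch)
W += dW; b_v += db_v; b_h += db_h
```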
Journal: CoRR
Volume: abs/1707.00797
Pages: -
Publication year: 2017